Influence of the Sparse Matrix Structure on Automatic Parallelisation Efficiency

Authors

  • M. Ast
  • C. Barrado
  • J. Cela
  • R. Fischer
  • J. Labarta
  • O. Laborda
  • H. Manz
  • U. Schulz
Abstract

The simulated models and requirements of engineering programs such as computational fluid dynamics and structural mechanics grow more rapidly than single-processor performance. Automatic parallelisation seems to be the obvious approach for huge and historic packages like PERMAS. In this paper we evaluate how preparatory steps on the large input matrices can improve the performance of the parallelisation. We show that a preparatory blocking of the matrix saves storage and decreases the critical path length of the task graph when it is done with variable-sized blocks. Also, a data distribution step is proposed that drives the modified dynamic scheduler. Results of this combination show an efficient parallelisation of the programs even on slow multiprocessor networks. Finally, the last step proposed is to interleave the array blocks that are distributed to different processors using a post-ordering algorithm. This step is essential to expose the parallelism to the scheduler.
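
The abstract measures the effect of blocking by the critical path length of the task graph, that is, the longest cost-weighted chain of dependent tasks. As a minimal sketch of that quantity only, the Python below computes it for two small task graphs. The function name, the task names, the costs, and the dependencies are all hypothetical and chosen for illustration; they are not taken from the paper or from PERMAS, and the regrouping shown is not the authors' blocking algorithm.

```python
from functools import lru_cache


def critical_path_length(costs, deps):
    """Return the length of the longest cost-weighted path in a task DAG.

    costs: dict mapping task name -> execution cost
    deps:  dict mapping task name -> list of tasks it must wait for
    """
    @lru_cache(maxsize=None)
    def finish(task):
        # Earliest finish time = own cost + latest finish among predecessors.
        start = max((finish(d) for d in deps.get(task, ())), default=0)
        return start + costs[task]

    return max(finish(t) for t in costs)


# Hypothetical graph 1: four unit-cost block tasks forming one chain.
fixed_costs = {"a": 1, "b": 1, "c": 1, "d": 1}
fixed_deps = {"b": ["a"], "c": ["b"], "d": ["c"]}

# Hypothetical graph 2: the same total work with a different grouping,
# in which one task no longer sits on the dependency chain.
variable_costs = {"ab": 2, "c": 1, "d": 1}
variable_deps = {"d": ["ab"]}

fixed_cp = critical_path_length(fixed_costs, fixed_deps)
variable_cp = critical_path_length(variable_costs, variable_deps)
print(fixed_cp, variable_cp)
```

With unlimited processors, no schedule can finish faster than the critical path, which is why a shorter path leaves the scheduler more room for parallel execution.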

Similar resources

On the generic parallelisation of iterative solvers for the finite element method

The numerical solution of partial differential equations frequently requires solving large and sparse linear systems. When using the Finite Element Method these systems exhibit a natural block structure that is exploited for efficiency in the “Iterative Solver Template Library” (ISTL). Based on existing sequential preconditioned iterative solvers we present an abstract parallelisation approach ...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

UCHPC – UnConventional High Performance Computing for Finite Element Simulations

Processor technology is still dramatically advancing and promises enormous improvements in processing data for the next decade. These improvements are driven by parallelisation and specialisation of resources, and ‘unconventional hardware’ like GPUs or the Cell processor can be seen as forerunners of this development. At the same time, much smaller advances are expected in moving data; this mea...

Parallel Geometric Multigrid

Multigrid methods are among the fastest numerical algorithms for the solution of large sparse systems of linear equations. While these algorithms exhibit asymptotically optimal computational complexity, their efficient parallelisation is hampered by the poor computation-to-communication ratio on the coarse grids. Our contribution discusses parallelisation techniques for geometric multigrid meth...

Publication date: 1999